RPATH Fix for portable_lib Python Extension #14422
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14422
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 1 Unrelated Failure
As of commit 446d855 with merge base 0f22062:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but was present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
    BUILD_RPATH "@loader_path/../../../torch/lib"
    INSTALL_RPATH "@loader_path/../../../torch/lib"
  )
else()
@GregoryComer Do you think this will work for windows?
That's a good question. I was planning to test the windows wheel on a local machine. I'll see if we run into any issues.
Fixing lint. Other than that, CI looks good. I took a look at the previous cherry-pick attempt to main (#13290) and it looks like the build failures were due to CMake export changes. That's resolved now.
RPATH Fix for portable_lib Python Extension
Problem: The _portable_lib.so Python extension built on CI couldn't find PyTorch libraries when installed locally because it had hardcoded absolute paths from the CI build environment.
Error:
ImportError: dlopen(.../_portable_lib.cpython-311-darwin.so, 0x0002): Library not loaded: @rpath/libtorch_python.dylib
  Referenced from: .../executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so
  Reason: tried: '/Users/runner/work/_temp/.../torch/lib/libtorch_python.dylib' (no such file)
Root Cause: The CMake build was linking to PyTorch libraries using absolute paths from the build environment, without setting proper relative RPATHs for runtime library resolution.
Solution: Added platform-specific relative RPATH settings to the portable_lib target in /Users/mnachin/executorch/CMakeLists.txt (lines 657-669):
- macOS: Uses @loader_path/../../../torch/lib to find PyTorch libraries relative to the .so file location
- Linux: Uses $ORIGIN/../../../torch/lib for the same purpose
- Sets both BUILD_RPATH and INSTALL_RPATH to ensure consistency
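To illustrate why three `..` components are enough, here is a minimal Python sketch of how a loader-relative RPATH entry expands. It is only a lexical approximation (the real dynamic loader also deals with symlinks and other search rules), and the helper name `resolve_rpath` and the `/opt/venv` install location are assumptions made for this example, not part of the PR.

```python
import posixpath


def resolve_rpath(rpath_entry: str, library_path: str) -> str:
    """Expand $ORIGIN / @loader_path to the directory containing the
    library, then normalize the result lexically (no symlink handling)."""
    origin = posixpath.dirname(library_path)
    for token in ("$ORIGIN", "@loader_path"):
        rpath_entry = rpath_entry.replace(token, origin)
    return posixpath.normpath(rpath_entry)


# Hypothetical install location, chosen only for illustration.
site_packages = "/opt/venv/lib/python3.11/site-packages"
extension = posixpath.join(
    site_packages,
    "executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so",
)

# macOS-style and Linux-style entries resolve to the same directory.
print(resolve_rpath("@loader_path/../../../torch/lib", extension))
print(resolve_rpath("$ORIGIN/../../../torch/lib", extension))
```

Because the extension lives in executorch/extension/pybindings/, going up three directories lands in site-packages, so the entry points at the torch/lib directory of whichever environment the wheel is installed into.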
Impact: This allows the wheel-packaged _portable_lib.so to find PyTorch libraries regardless of the installation location, fixing the runtime linking issue when using ExecuTorch wheels built on CI.
Note: The same fix may be needed for _training_lib if it experiences similar issues.
Test Plan:
```
# Build the wheel locally
python setup.py bdist_wheel
# create fresh conda env
conda create -yn executorch_test_11 python=3.11.0 && conda activate executorch_test_11
# install
pip install ./dist/executorch-*.whl
# Verify
python -c "from executorch.extension.pybindings._portable_lib import _load_for_executorch; print('Success!')"
```
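If the import in the last step still fails, one thing worth checking is whether a torch/lib directory actually exists where the relative RPATH points. The sketch below is an assumption-laden helper rather than part of the PR's test plan: the function name `find_torch_lib_dir` and the example path are hypothetical, and it only inspects the directory layout without reading the RPATH stored in the binary.

```python
from pathlib import Path
from typing import Optional, Union


def find_torch_lib_dir(extension_file: Union[str, Path]) -> Optional[Path]:
    """Return the torch/lib directory three levels above the folder that
    holds the extension, or None if no such directory exists."""
    ext_dir = Path(extension_file).resolve().parent
    candidate = (ext_dir / ".." / ".." / ".." / "torch" / "lib").resolve()
    return candidate if candidate.is_dir() else None


# Hypothetical path; substitute the location of the installed extension.
example = (
    "/opt/venv/lib/python3.11/site-packages/"
    "executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so"
)
print(find_torch_lib_dir(example))
```

A `None` result suggests that PyTorch is not installed in the same site-packages directory as the wheel, in which case the relative RPATH has nothing to resolve to.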
2c4f7ba to 446d855
Thank you!
I verified that the wheel built from this job, installed into a clean env, can load pybindings on macOS. I also tested on Windows in a clean env, and I can't load the _portable_lib DLL, though it's also broken on the release branch. I'll file it as a release-blocking task.
@pytorchbot cherry-pick --onto release/1.0 -c critical
**Note: This is an attempt to cherry-pick Mergen's RPATH fix from #13254 onto main. Fixes #14421. Original description below.**
Problem: The _portable_lib.so Python extension built on CI couldn't find PyTorch libraries when installed locally because it had hardcoded absolute paths from the CI build environment.
Error:
ImportError: dlopen(.../_portable_lib.cpython-311-darwin.so, 0x0002): Library not loaded: @rpath/libtorch_python.dylib
  Referenced from: .../executorch/extension/pybindings/_portable_lib.cpython-311-darwin.so
  Reason: tried: '/Users/runner/work/_temp/.../torch/lib/libtorch_python.dylib' (no such file)
Root Cause: The CMake build was linking to PyTorch libraries using absolute paths from the build environment, without setting proper relative RPATHs for runtime library resolution.
Solution: Added platform-specific relative RPATH settings to the portable_lib target in /Users/mnachin/executorch/CMakeLists.txt (lines 657-669):
- macOS: Uses @loader_path/../../../torch/lib to find PyTorch libraries relative to the .so file location
- Linux: Uses $ORIGIN/../../../torch/lib for the same purpose
- Sets both BUILD_RPATH and INSTALL_RPATH to ensure consistency
Impact: This allows the wheel-packaged _portable_lib.so to find PyTorch libraries regardless of the installation location, fixing the runtime linking issue when using ExecuTorch wheels built on CI.
Note: The same fix may be needed for _training_lib if it experiences similar issues.
Test Plan:
```
# Build the wheel locally
python setup.py bdist_wheel
# create fresh conda env
conda create -yn executorch_test_11 python=3.11.0 && conda activate executorch_test_11
# install
pip install ./dist/executorch-*.whl
# Verify
python -c "from executorch.extension.pybindings._portable_lib import _load_for_executorch; print('Success!')"
```
Co-authored-by: Mergen Nachin
(cherry picked from commit 641e737)
Cherry picking #14422
The cherry pick PR is at #14442 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:
Details for Dev Infra team: Raised by workflow job